162 research outputs found

    Revised Annotations, Sex-Biased Expression, and Lineage-Specific Genes in the Drosophila melanogaster group

    Here, we provide revised gene models for D. ananassae, D. yakuba, and D. simulans, which include UTRs and empirically verified intron-exon boundaries, as well as ortholog groups identified using a fuzzy reciprocal-best-hit BLAST comparison. Using these revised annotations, we perform differential expression testing using the Cufflinks suite to provide a broad overview of differential expression between reproductive tissues and the carcass. We identify thousands of genes that are differentially expressed across tissues in D. yakuba and D. simulans, with roughly 60% agreement in expression patterns of orthologs in D. yakuba and D. simulans. We identify several cases of putative polycistronic transcripts, pointing to a combination of transcriptional read-through in the genome as well as putative gene fusion and fission events across taxa. We furthermore identify hundreds of lineage-specific genes in each species with no BLAST hits among transcripts of any other Drosophila species, which are candidates for neofunctionalized proteins and a potential source of genetic novelty. Comment: Revised manuscript, also available online preprint at G3: Genes, Genomes, Genetics. Gene models, ortholog calls, and tissue-specific expression results are available at http://github.com/ThorntonLab/GFF or the UCSC browser on the Thornton Lab public track hub at http://genome.ucsc.ed
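
    The reciprocal-best-hit idea mentioned in this abstract can be sketched as follows. This is a minimal illustration of strict reciprocal best hits, not the authors' "fuzzy" variant, whose tolerance rules are not described here. The function names, the score dictionaries, and the gene identifiers are all hypothetical.

```python
def best_hits(scores):
    """Return {query: best_subject} from a {(query, subject): score} mapping."""
    best = {}
    for (query, subject), score in scores.items():
        # Keep the highest-scoring subject seen so far for each query.
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where each gene is the other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical alignment scores between genes of species A and species B.
a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 50, ("a2", "b2"): 120, ("a3", "b1"): 90}
b_vs_a = {("b1", "a1"): 190, ("b2", "a1"): 60, ("b2", "a2"): 110}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

    In this toy input, a3 has b1 as its best hit but b1 prefers a1, so a3 is left without an ortholog call; a fuzzy scheme would presumably relax that strict requirement.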

    Landscape of standing variation for tandem duplications in Drosophila yakuba and Drosophila simulans

    We have used whole-genome paired-end Illumina sequence data to identify tandem duplications in 20 isofemale lines of D. yakuba and 20 isofemale lines of D. simulans, and performed genome-wide validation with PacBio long molecule sequencing. We identify 1,415 tandem duplications that are segregating in D. yakuba as well as 975 duplications in D. simulans, indicating greater variation in D. yakuba. Additionally, we observe high rates of secondary deletions at duplicated sites, with 8% of duplicated sites in D. simulans and 17% of sites in D. yakuba modified with deletions. These secondary deletions are consistent with the action of the large loop mismatch repair system acting to remove polymorphic tandem duplications, resulting in rapid dynamics of gain and loss in duplicated alleles and a richer substrate of genetic novelty than has been previously reported. Most duplications are present in only single strains, suggesting deleterious impacts are common. D. simulans shows larger numbers of whole gene duplications in comparison to larger proportions of gene fragments in D. yakuba. D. simulans displays an excess of high frequency variants on the X chromosome, consistent with adaptive evolution through duplications on the D. simulans X or demographic forces driving duplicates to high frequency. We identify 78 chimeric genes in D. yakuba and 38 chimeric genes in D. simulans, as well as 143 cases of recruited non-coding sequence in D. yakuba and 96 in D. simulans, in agreement with rates of chimeric gene origination in D. melanogaster. Together, these results suggest that tandem duplications often result in complex variation beyond whole gene duplications that offers a rich substrate of standing variation that is likely to contribute both to detrimental phenotypes and disease, as well as to adaptive evolutionary change. Comment: Revised version, accepted at Molecular Biology and Evolutio

    A genomic map of the effects of linked selection in Drosophila

    Natural selection at one site shapes patterns of genetic variation at linked sites. Quantifying the effects of 'linked selection' on levels of genetic diversity is key to making reliable inference about demography, building a null model in scans for targets of adaptation, and learning about the dynamics of natural selection. Here, we introduce the first method that jointly infers parameters of distinct modes of linked selection, notably background selection and selective sweeps, from genome-wide diversity data, functional annotations and genetic maps. The central idea is to calculate the probability that a neutral site is polymorphic given local annotations, substitution patterns, and recombination rates. Information is then combined across sites and samples using composite likelihood in order to estimate genome-wide parameters of distinct modes of selection. In addition to parameter estimation, this approach yields a map of the expected neutral diversity levels along the genome. To illustrate the utility of our approach, we apply it to genome-wide resequencing data from 125 lines in Drosophila melanogaster and reliably predict diversity levels at the 1Mb scale. Our results corroborate estimates of a high fraction of beneficial substitutions in proteins and untranslated regions (UTRs). They allow us to distinguish between the contribution of sweeps and other modes of selection around amino acid substitutions and to uncover evidence for pervasive sweeps in UTRs. Our inference further suggests a substantial effect of linked selection from non-classic sweeps. More generally, we demonstrate that linked selection has had a larger effect in reducing diversity levels and increasing their variance in D. melanogaster than previously appreciated.
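
    The composite-likelihood step described in this abstract can be illustrated with a small sketch. It assumes the per-site probability of being polymorphic has already been computed by some model; the paper derives that probability from annotations, substitution patterns, and recombination rates, which this example does not attempt to reproduce. The function name and the toy inputs are hypothetical.

```python
import math


def composite_log_likelihood(observed, probs):
    """Sum per-site log-probabilities, treating sites as if independent.

    observed: iterable of 0/1 flags (1 = site is polymorphic in the sample).
    probs:    iterable of model-predicted probabilities of polymorphism,
              each strictly between 0 and 1.
    """
    total = 0.0
    for is_polymorphic, p in zip(observed, probs):
        # Bernoulli log-probability of the observed state at this site.
        total += math.log(p) if is_polymorphic else math.log(1.0 - p)
    return total


# Toy data: four sites with assumed model predictions.
observed = [1, 0, 0, 1]
probs = [0.5, 0.5, 0.5, 0.5]
value = composite_log_likelihood(observed, probs)
print(value)
```

    Parameter estimation would then amount to searching for the model parameters that maximize this sum; the independence assumption across linked sites is what makes the likelihood "composite" rather than exact.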

    Patterns of intron sequence evolution in Drosophila are dependent upon length and GC content

    BACKGROUND: Introns comprise a large fraction of eukaryotic genomes, yet little is known about their functional significance. Regulatory elements have been mapped to some introns, though these are believed to account for only a small fraction of genome-wide intronic DNA. No consistent patterns have emerged from studies that have investigated general levels of evolutionary constraint in introns. RESULTS: We examine the relationship between intron length and levels of evolutionary constraint by analyzing inter-specific divergence at 225 intron fragments in Drosophila melanogaster and Drosophila simulans, sampled from a broad distribution of intron lengths. We document a strongly negative correlation between intron length and divergence. Interestingly, we also find that divergence in introns is negatively correlated with GC content. This relationship does not account for the correlation between intron length and divergence, however, and may simply reflect local variation in mutational rates or biases. CONCLUSION: Short introns make up only a small fraction of total intronic DNA in the genome. Our finding that long introns evolve more slowly than average implies that, while the majority of introns in the Drosophila genome may experience little or no selective constraint, most intronic DNA in the genome is likely to be evolving under considerable constraint. Our results suggest that functional elements may be ubiquitous within longer introns and that these introns may have a more general role in regulating gene expression than previously appreciated. Our finding that GC content and divergence are negatively correlated in introns has important implications for the interpretation of the correlation between divergence and levels of codon bias observed in Drosophila.

    Evolution of multiple additive loci caused divergence between Drosophila yakuba and D. santomea in wing rowing during male courtship

    In Drosophila, male flies perform innate, stereotyped courtship behavior. This innate behavior evolves rapidly between fly species, and is likely to have contributed to reproductive isolation and species divergence. We currently understand little about the neurobiological and genetic mechanisms that contributed to the evolution of courtship behavior. Here we describe a novel behavioral difference between the two closely related species D. yakuba and D. santomea: the frequency of wing rowing during courtship. During courtship, D. santomea males repeatedly rotate their wing blades to face forward and then back (rowing), while D. yakuba males rarely row their wings. We found little intraspecific variation in the frequency of wing rowing for both species. We exploited multiplexed shotgun genotyping (MSG) to genotype two backcross populations with a single lane of Illumina sequencing. We performed quantitative trait locus (QTL) mapping using the ancestry information estimated by MSG and found that the species difference in wing rowing mapped to four or five genetically separable regions. We found no evidence that these loci display epistasis. The identified loci all act in the same direction and can account for most of the species difference.

    Constraints on the evolution of toxin-resistant Na,K-ATPases have limited dependence on sequence divergence

    A growing body of theoretical and experimental evidence suggests that intramolecular epistasis is a major determinant of rates and patterns of protein evolution and imposes a substantial constraint on the evolution of novel protein functions. Here, we examine the role of intramolecular epistasis in the recurrent evolution of resistance to cardiotonic steroids (CTS) across tetrapods, which occurs via specific amino acid substitutions to the α-subunit family of Na,K-ATPases (ATP1A). After identifying a series of recurrent substitutions at two key sites of ATP1A that are predicted to confer CTS resistance in diverse tetrapods, we then performed protein engineering experiments to test the functional consequences of introducing these substitutions onto divergent species backgrounds. In line with previous results, we find that substitutions at these sites can have substantial background-dependent effects on CTS resistance. Globally, however, these substitutions also have pleiotropic effects that are consistent with additive rather than background-dependent effects. Moreover, the magnitude of a substitution’s effect on activity does not depend on the overall extent of ATP1A sequence divergence between species. Our results suggest that epistatic constraints on the evolution of CTS-resistant forms of Na,K-ATPase likely depend on a small number of sites, with little dependence on overall levels of protein divergence. We propose that dependence on a limited number of sites may account for the observation of convergent CTS resistance substitutions observed among taxa with highly divergent Na,K-ATPases (See S1 Text for Spanish translation).